Mining Frequent Patterns Through Microaggregation in Differential Privacy
Abstract
Frequent pattern mining is widely used to analyze transaction datasets, but the question of how to protect the sensitive information those datasets contain remains largely unanswered. The differential privacy model provides a robust privacy guarantee, whereas the k-anonymity model provides better data utility. This paper proposes a synergetic approach that simultaneously protects privacy and enhances data utility when mining top-k frequent patterns. First, microaggregated data is released, which achieves k-anonymity regardless of the types of queries the user issues. Second, the top-k frequent patterns are selected from the microaggregated data using the exponential mechanism. Finally, the true support of each selected top-k frequent pattern is perturbed by adding Laplace noise.
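The selection and perturbation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the input transactions have already been microaggregated (that step is not shown), that each support count has sensitivity 1, that the privacy budget is split evenly between selection and noise, and that candidate patterns are limited to small itemsets. All function names and parameters (`candidate_supports`, `exponential_top_k`, `laplace_noise`, `private_top_k`, `max_len`) are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import combinations


def candidate_supports(transactions, max_len=2):
    """Count the support of every itemset of size 1..max_len."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for size in range(1, max_len + 1):
            for pattern in combinations(items, size):
                counts[pattern] += 1
    return counts


def exponential_top_k(supports, k, epsilon, rng, sensitivity=1.0):
    """Draw k patterns without replacement using the exponential mechanism.

    Each draw spends epsilon / k; the score of a pattern is its support.
    """
    remaining = dict(supports)
    chosen = []
    eps_per_draw = epsilon / k
    for _ in range(min(k, len(remaining))):
        patterns = list(remaining)
        top = max(remaining.values())
        # Shift scores by the maximum so exp() cannot overflow.
        weights = [
            math.exp(eps_per_draw * (remaining[p] - top) / (2 * sensitivity))
            for p in patterns
        ]
        pick = rng.choices(patterns, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]
    return chosen


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as a scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_top_k(transactions, k, epsilon, rng, max_len=2):
    """Select top-k patterns, then perturb their supports with Laplace noise."""
    supports = candidate_supports(transactions, max_len)
    eps_select = eps_noise = epsilon / 2  # assumed even budget split
    chosen = exponential_top_k(supports, k, eps_select, rng)
    scale = k / eps_noise  # assumed sensitivity 1, eps_noise / k per count
    return {p: supports[p] + laplace_noise(scale, rng) for p in chosen}


if __name__ == "__main__":
    data = [["a", "b"], ["a"], ["a", "b", "c"], ["b", "c"]]
    print(private_top_k(data, k=2, epsilon=1.0, rng=random.Random(0)))
```

A smaller `epsilon` makes both the selection and the reported supports noisier, which is the privacy-utility trade-off the abstract refers to.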
Similar resources
Mining Frequent Patterns with Differential Privacy
The mining of frequent patterns is a fundamental component in many data mining tasks. A considerable amount of research on this problem has led to a wide range of efficient and scalable algorithms for mining frequent patterns. However, releasing these patterns raises concerns about the privacy of the users participating in the data. Indeed the information from the patterns can be linked with a...
Repeated Record Ordering for Constrained Size Clustering
One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and the social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into existing clustering algorithms. One well-researched constrained clustering algorithm is microaggregation. In a microaggreg...
A Two-Phase Algorithm for Differentially Private Frequent Subgraph Mining
Mining frequent subgraphs from a collection of input graphs is an important task for exploratory data analysis on graph data. However, if the input graphs contain sensitive information, releasing discovered frequent subgraphs may pose considerable threats to individual privacy. In this paper, we study the problem of frequent subgraph mining (FSM) under the rigorous differential privacy model. W...
Constrained Microaggregation: Adding Constraints for Data Editing
Privacy-preserving data mining and statistical disclosure control have introduced several methods for data perturbation that can be used to ensure the privacy of data respondents. Such methods, for example rank swapping and microaggregation, perturb the data by introducing some kind of noise. Nevertheless, data are usually edited with care after collection to remove inconsistencies, and such...
A Study of Differentially Private Frequent Itemset Mining
Frequent sets play an important role in many data mining tasks that search for interesting patterns in databases, such as association rules, sequences, correlations, episodes, classifiers and clusters. Frequent Itemset Mining (FIM) is one of the most well-known techniques for extracting knowledge from datasets. In this paper differential privacy aims to get means to increase the accuracy of queries fr...